Mdc/add protpardelle#274
Conversation
Wrap sequence-conditioned all-atom Protpardelle-1c models (task "ai-allatom", e.g. cc89.yaml) under the StructureModelWrapper protocol. featurize() derives sequence conditioning (aatype/seq_mask/residue_index/ chain_index/atom37 mask) from an Atomworks structure's chain_info, and step() runs the full reverse-diffusion trajectory internally via Protpardelle.sample(gt_aatype=...), returning [batch, atoms, 3] coords. - Register PROTPARDELLE_AVAILABLE / require_protpardelle / check_protpardelle_available (catching OSError from protpardelle.env). - Add StructurePredictor.PROTPARDELLE. - Tests build a small random ai-allatom model (no weights needed); cover sequence extraction, featurization shapes, prior init, protocol conformance, and an end-to-end short-trajectory sampling smoke test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…sn't just give us unguided samples
## Summary Completes the Protpardelle debugging work from ENG-75 / #257 on top of `mdc/add-protpardelle`. This PR intentionally targets `mdc/add-protpardelle` so the review contains only the final Protpardelle fixes, not the whole in-progress integration branch. Key changes: - Restores the EDM sampler/model-wrapper call shape to the protocol-compatible `step(noisy_state, t_hat, features=features)` after removing the temporary extra `eps` argument. - Implements Protpardelle step-time noise handling via `_expand_noise_level()`, broadcasting scalar or per-batch EDM timesteps to Protpardelle's expected `B x L` tensor. - Moves Protpardelle step inputs/conditioning tensors onto the wrapper device before the model forward pass. - Keeps `prot_lens_per_chain` on CPU when calling Protpardelle's sampling helper to avoid CPU/GPU device mismatch in helper-created residue indices. - Maps Atomworks selenomethionine atom name `SE` into Protpardelle atom37's methionine sulfur slot `SD`. - Fixes `ProtpardelleConditioning` immutability so dataclass construction can complete before selected conditioning fields become frozen. - Allows packaged Protpardelle config/data under `src/sampleworks/data/**` through `.actlignore` so ACTL sync includes the runtime YAML config. - Re-enables the slow `step()` behavior test and adds coverage for the MSE selenium atom mapping. ## Validation Local static checks: - `uvx ty check src/sampleworks/models/protpardelle/wrapper.py src/sampleworks/core/samplers/edm.py tests/models/protpardelle/test_protpardelle_wrapper.py` - `uvx ruff check src/sampleworks/models/protpardelle/wrapper.py src/sampleworks/core/samplers/edm.py tests/models/protpardelle/test_protpardelle_wrapper.py` Remote ACTL / Protpardelle environment checks: - `pixi run -e protpardelle-dev pytest tests/models/protpardelle -m "not slow"` - `27 passed, 1 deselected` - `pixi run -e protpardelle-dev pytest tests/models/protpardelle/test_protpardelle_wrapper.py::TestStep::test_step_returns_coords` - `1 passed` - Exact reduced Protpardelle guidance smoke command from #257 using the shared checkpoint and 1VME inputs - completed successfully with `Guidance run successfully!` - wrote results to `output/protpardelle-smoke-guided/` Non-blocking warnings observed during the smoke run were expected environment warnings for unavailable optional model/tool paths and missing mirror environment variables; they did not prevent the Protpardelle run from completing. Refs #257. Co-authored-by: xraymemory <me.anzuoni@gmail.com>
ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (5)
💤 Files with no reviewable changes (1)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (2)
📝 WalkthroughWalkthroughAdds Protpardelle support across config, wrapper logic, guidance wiring, availability checks, fixtures, and tests. Also updates Pixi metadata, bundled data handling, and runner-test inputs. ChangesProtpardelle wrapper and guidance integration
Estimated code review effort: 4 (Complex) | ~60 minutes Possibly related issues
Possibly related PRs
Suggested reviewers: 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 8
🧹 Nitpick comments (1)
tests/models/protpardelle/test_protpardelle_wrapper.py (1)
147-160: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winKeep these assertions at the public wrapper boundary.
These tests reach into
_build_sampling_kwargs()and_convert_to_atom37()directly, which makes the suite brittle to internal refactors that preservefeaturize(),initialize_from_prior(), andstep()behavior. Either move the coverage to public API outcomes or promote the helpers if they are intended contract. As per coding guidelines,tests/**/*.py: "Write black-box tests that verify behavior, not implementation."Also applies to: 287-320
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/models/protpardelle/test_protpardelle_wrapper.py` around lines 147 - 160, The tests are asserting private helper behavior via ProtpardelleWrapper._build_sampling_kwargs() and _convert_to_atom37(), which makes them brittle to internal refactors. Move this coverage to black-box assertions around the public wrapper methods featurize(), initialize_from_prior(), and step(), or explicitly promote those helpers if they are meant to be part of the contract. Keep the checks focused on observable outputs and behavior at the wrapper boundary rather than internal implementation details.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pyproject.toml`:
- Around line 139-141: The pyproject.toml currently defines the
dependency-overrides table twice under tool.pixi.pypi-options, which breaks TOML
parsing. Consolidate the duplicate [tool.pixi.pypi-options.dependency-overrides]
blocks into a single table, and keep the pandas and gemmi pins together there
only if they are intended to apply globally.
In `@src/sampleworks/models/protpardelle/wrapper.py`:
- Around line 707-709: The self-conditioning cached in
features.conditioning.x_self_conditioning is being stored with its autograd
history intact, which can carry the prior graph into later sampling steps.
Update the assignment in step() to cache a detached version of x_self_cond
before writing it to features.conditioning.x_self_conditioning, so repeated
iterations do not retain gradients across steps.
In `@src/sampleworks/utils/guidance_script_utils.py`:
- Around line 223-230: In the Protpardelle branch of guidance_script_utils, the
config_path is still built from a hardcoded src/sampleworks/data path, which
breaks when the package is installed and run from outside the repo. Update the
ProtpardelleWrapper config_path handling to load cc89_epoch415.yaml as a bundled
package resource using the package’s resource-loading APIs instead of resolving
a filesystem path relative to the current working directory. Keep the change
localized to the model-loading logic around StructurePredictor.PROTPARDELLE and
ProtpardelleWrapper.
- Around line 63-67: The Protpardelle fallback in the import block for
ProtpardelleWrapper is too narrow and can still crash non-Protpardelle runs.
Update the try/except around the ProtpardelleWrapper import to catch the
additional failure modes noted in imports.py, not just ImportError, and keep the
existing fallback assignment/logging. Also make the bundled YAML path in the
guidance script resolution independent of the current checkout by resolving it
from the package/module location instead of using a repo-relative path.
In `@src/sampleworks/utils/imports.py`:
- Around line 265-270: The install hint in require_any_model() is stale because
the availability check now includes PROTPARDELLE_AVAILABLE, but the
default_message still only mentions Boltz, Protenix, and RF3. Update the
default_message text in src/sampleworks/utils/imports.py to include Protpardelle
alongside the other supported model options so the decorator’s remediation
guidance matches the current logic.
- Around line 187-230: The bare decorator usage of require_protpardelle is
broken because the function object is being passed into the message parameter,
so the wrapper is not applied. Update require_protpardelle in imports.py to
support both `@require_protpardelle` and `@require_protpardelle`("...") by detecting
when the first argument is a callable vs a custom message, or remove the bare
form from the examples/docstring so only the parenthesized usage is advertised.
In `@tests/models/protpardelle/conftest.py`:
- Around line 17-21: The `os.environ.setdefault(...)` usage is eagerly creating
a temp directory via `tempfile.mkdtemp(...)` even when
`PROTPARDELLE_MODEL_PARAMS` is already set, so update the setup in `conftest.py`
to only call `mkdtemp` when the env var is missing. Apply the same lazy pattern
in `test_protpardelle_wrapper.py` where the same `setdefault`/`mkdtemp` usage
appears, using the existing `PROTPARDELLE_MODEL_PARAMS` guard to avoid
allocating untracked temp dirs on import.
In `@tests/runs/test_runner.py`:
- Around line 29-30: The argv assertion in the test is not hermetic because
runner._build_argv() can switch to a direct Python executable when a baked pixi
env is detected. Update the test setup for the affected assertion in
test_runner.py to force the pixi code path before checking the ["pixi", "run",
"-e", "rf3", "python"] prefix, using the existing test fixture/monkeypatch
around runner._build_argv() so the environment detection cannot change the
expected argv.
---
Nitpick comments:
In `@tests/models/protpardelle/test_protpardelle_wrapper.py`:
- Around line 147-160: The tests are asserting private helper behavior via
ProtpardelleWrapper._build_sampling_kwargs() and _convert_to_atom37(), which
makes them brittle to internal refactors. Move this coverage to black-box
assertions around the public wrapper methods featurize(),
initialize_from_prior(), and step(), or explicitly promote those helpers if they
are meant to be part of the contract. Keep the checks focused on observable
outputs and behavior at the wrapper boundary rather than internal implementation
details.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8a25473f-58ce-43f8-b062-7b353055864f
⛔ Files ignored due to path filters (1)
pixi.lockis excluded by!**/*.lock
📒 Files selected for processing (15)
.actlignorepyproject.tomlsrc/sampleworks/cli/guidance.pysrc/sampleworks/core/samplers/edm.pysrc/sampleworks/data/cc89_epoch415.yamlsrc/sampleworks/models/protpardelle/__init__.pysrc/sampleworks/models/protpardelle/wrapper.pysrc/sampleworks/utils/guidance_constants.pysrc/sampleworks/utils/guidance_script_arguments.pysrc/sampleworks/utils/guidance_script_utils.pysrc/sampleworks/utils/imports.pytests/models/protpardelle/__init__.pytests/models/protpardelle/conftest.pytests/models/protpardelle/test_protpardelle_wrapper.pytests/runs/test_runner.py
| def require_protpardelle(message: str | None = None) -> Callable[[F], F]: | ||
| """Decorator to require Protpardelle model availability. | ||
|
|
||
| Parameters | ||
| ---------- | ||
| message: str, optional | ||
| Custom error message. If None, uses default message. | ||
|
|
||
| Returns | ||
| ------- | ||
| Callable | ||
| Decorator function | ||
|
|
||
| Examples | ||
| -------- | ||
| >>> @require_protpardelle | ||
| ... def sample_protpardelle(): | ||
| ... pass | ||
|
|
||
| >>> @require_protpardelle("Custom error message") | ||
| ... def custom_function(): | ||
| ... pass | ||
| """ | ||
| default_message = ( | ||
| "Protpardelle model wrapper is not available. Install with: " | ||
| "pixi install -e protpardelle" | ||
| ) | ||
|
|
||
| def decorator(func: F) -> F: | ||
| @functools.wraps(func) | ||
| def wrapper(*args: Any, **kwargs: Any) -> Any: | ||
| if not PROTPARDELLE_AVAILABLE: | ||
| error_msg = message or default_message | ||
| try: | ||
| import pytest | ||
|
|
||
| pytest.skip(error_msg) | ||
| except ImportError: | ||
| raise ImportError(error_msg) from None | ||
| return func(*args, **kwargs) | ||
|
|
||
| return wrapper # type: ignore | ||
|
|
||
| return decorator |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
git ls-files src/sampleworks/utils/imports.py
wc -l src/sampleworks/utils/imports.py
sed -n '150,260p' src/sampleworks/utils/imports.py | cat -nRepository: diff-use/sampleworks
Length of output: 3914
Support the bare decorator form or remove it from the docs. @require_protpardelle passes the function object into message, so the example does not wrap the target function. Either accept both decorator forms or change the examples to require parentheses.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/sampleworks/utils/imports.py` around lines 187 - 230, The bare decorator
usage of require_protpardelle is broken because the function object is being
passed into the message parameter, so the wrapper is not applied. Update
require_protpardelle in imports.py to support both `@require_protpardelle` and
`@require_protpardelle`("...") by detecting when the first argument is a callable
vs a custom message, or remove the bare form from the examples/docstring so only
the parenthesized usage is advertised.
There was a problem hiding this comment.
@coderabbitai make an issue to track this as an issue, do not implement it. Let one of the humans implement it.
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
gh issue create \
--repo diff-use/sampleworks \
--title "require_protpardelle decorator does not support bare `@decorator` usage" \
--assignee marcuscollins \
--body "## Summary
In \`src/sampleworks/utils/imports.py\`, the \`require_protpardelle\` decorator factory (added around lines 187-230) only supports the parenthesized form \`@require_protpardelle(\"message\")\`. When used as a bare decorator, \`@require_protpardelle\`, the decorated function object is passed in as the \`message\` argument instead of being wrapped, so the function is never actually wrapped/protected. The docstring's first example (\`@require_protpardelle\`) is therefore misleading and does not work as documented.
## Rationale
The same pattern is used for the other \`require_*\` decorators (\`require_boltz\`, \`require_protenix\`, \`require_rf3\`) in the same file, so they likely share this issue, but this report focuses on \`require_protpardelle\` per the originating review comment.
## Affected area
- File: \`src/sampleworks/utils/imports.py\`
- Function: \`require_protpardelle\` (and docstring examples)
## Suggested fix
Either:
1. Update \`require_protpardelle\` to detect when the first positional argument is a callable (bare decorator usage) vs. a string message, and branch accordingly to support both \`@require_protpardelle\` and \`@require_protpardelle(\"message\")\`, or
2. Remove/correct the bare-decorator example in the docstring so only the parenthesized form \`@require_protpardelle()\` is documented/supported.
## Acceptance criteria
- [ ] \`require_protpardelle\` works correctly when used as \`@require_protpardelle\` (bare) or docstring is updated to remove that example.
- [ ] \`require_protpardelle(\"custom message\")\` continues to work as before.
- [ ] Docstring examples accurately reflect supported usage.
## References
- PR: https://github.com/diff-use/sampleworks/pull/274
- Originating review comment: https://github.com/diff-use/sampleworks/pull/274#discussion_r3493412905
- Requested by: `@marcuscollins`"Length of output: 2450
🏁 Script executed:
#!/bin/bash
set -euo pipefail
gh issue create \
--repo diff-use/sampleworks \
--title "require_protpardelle decorator does not support bare `@decorator` usage" \
--body "## Summary
In \`src/sampleworks/utils/imports.py\`, the \`require_protpardelle\` decorator factory (added around lines 187-230) only supports the parenthesized form \`@require_protpardelle(\"message\")\`. When used as a bare decorator, \`@require_protpardelle\`, the decorated function object is passed in as the \`message\` argument instead of being wrapped, so the function is never actually wrapped/protected. The docstring's first example (\`@require_protpardelle\`) is therefore misleading and does not work as documented.
## Rationale
The same pattern is used for the other \`require_*\` decorators (\`require_boltz\`, \`require_protenix\`, \`require_rf3\`) in the same file, so they likely share this issue, but this report focuses on \`require_protpardelle\` per the originating review comment.
## Affected area
- File: \`src/sampleworks/utils/imports.py\`
- Function: \`require_protpardelle\` (and docstring examples)
## Suggested fix
Either:
1. Update \`require_protpardelle\` to detect when the first positional argument is a callable (bare decorator usage) vs. a string message, and branch accordingly to support both \`@require_protpardelle\` and \`@require_protpardelle(\"message\")\`, or
2. Remove/correct the bare-decorator example in the docstring so only the parenthesized form \`@require_protpardelle()\` is documented/supported.
## Acceptance criteria
- [ ] \`require_protpardelle\` works correctly when used as \`@require_protpardelle\` (bare) or docstring is updated to remove that example.
- [ ] \`require_protpardelle(\"custom message\")\` continues to work as before.
- [ ] Docstring examples accurately reflect supported usage.
## References
- PR: https://github.com/diff-use/sampleworks/pull/274
- Originating review comment: https://github.com/diff-use/sampleworks/pull/274#discussion_r3493412905
- Requested by: `@marcuscollins`"Length of output: 2226
@marcuscollins, done — I've opened #285 to track this so a human can implement it: #285
Note: I wasn't able to set you as the assignee due to a permissions error on the integration's GraphQL mutation, so you'll need to assign it manually.
🐰📝
🧠 Learnings used
Learnt from: k-chrispens
Repo: diff-use/sampleworks PR: 73
File: src/sampleworks/core/samplers/protocol.py:1-18
Timestamp: 2026-02-02T19:13:51.594Z
Learning: In Python files across the repo, do not suggest or add inline or file-level suppressions for Ruff F722 (jaxtyping dimension strings). Instead rely on a global ignore configured in pyproject.toml (as done by the sampleworks project). Verify that F722 is included in the global ignore, and avoid recommending per-file # ruff: noqa: F722 directives unless explicitly documented as an exception.
Learnt from: marcuscollins
Repo: diff-use/sampleworks PR: 132
File: src/sampleworks/utils/guidance_script_utils.py:585-586
Timestamp: 2026-03-05T16:30:40.514Z
Learning: In Python code, if enums subclassing StrEnum are used (e.g., GuidanceType, StructurePredictor), they serialize to plain strings with json.dump and pickle without special handling. Do not flag these as non-serializable in code reviews. Treat StrEnum values as strings for JSON serialization and ensure tests cover that behavior; no extra pickle handling needed.
| # Must be set before any `import protpardelle...` happens. Respect an | ||
| # externally configured directory (e.g. when real weights are available). | ||
| os.environ.setdefault( | ||
| "PROTPARDELLE_MODEL_PARAMS", tempfile.mkdtemp(prefix="protpardelle_model_params_") | ||
| ) |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win
Avoid allocating a temp dir inside setdefault().
setdefault() evaluates tempfile.mkdtemp(...) eagerly, so this creates an untracked temp directory on every import even when PROTPARDELLE_MODEL_PARAMS is already set. The same pattern appears again in tests/models/protpardelle/test_protpardelle_wrapper.py.
Suggested fix
-os.environ.setdefault(
- "PROTPARDELLE_MODEL_PARAMS", tempfile.mkdtemp(prefix="protpardelle_model_params_")
-)
+_MODEL_PARAMS_DIR = None
+if "PROTPARDELLE_MODEL_PARAMS" not in os.environ:
+ _MODEL_PARAMS_DIR = tempfile.TemporaryDirectory(
+ prefix="protpardelle_model_params_"
+ )
+ os.environ["PROTPARDELLE_MODEL_PARAMS"] = _MODEL_PARAMS_DIR.name📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # Must be set before any `import protpardelle...` happens. Respect an | |
| # externally configured directory (e.g. when real weights are available). | |
| os.environ.setdefault( | |
| "PROTPARDELLE_MODEL_PARAMS", tempfile.mkdtemp(prefix="protpardelle_model_params_") | |
| ) | |
| # Must be set before any `import protpardelle...` happens. Respect an | |
| # externally configured directory (e.g. when real weights are available). | |
| _MODEL_PARAMS_DIR = None | |
| if "PROTPARDELLE_MODEL_PARAMS" not in os.environ: | |
| _MODEL_PARAMS_DIR = tempfile.TemporaryDirectory( | |
| prefix="protpardelle_model_params_" | |
| ) | |
| os.environ["PROTPARDELLE_MODEL_PARAMS"] = _MODEL_PARAMS_DIR.name |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/models/protpardelle/conftest.py` around lines 17 - 21, The
`os.environ.setdefault(...)` usage is eagerly creating a temp directory via
`tempfile.mkdtemp(...)` even when `PROTPARDELLE_MODEL_PARAMS` is already set, so
update the setup in `conftest.py` to only call `mkdtemp` when the env var is
missing. Apply the same lazy pattern in `test_protpardelle_wrapper.py` where the
same `setdefault`/`mkdtemp` usage appears, using the existing
`PROTPARDELLE_MODEL_PARAMS` guard to avoid allocating untracked temp dirs on
import.
There was a problem hiding this comment.
There was a problem hiding this comment.
@marcuscollins Sounds good — thanks for tracking this in #286. I'll leave it as-is for this PR and won't insist further here.
🧠 Learnings used
Learnt from: k-chrispens
Repo: diff-use/sampleworks PR: 73
File: src/sampleworks/core/samplers/protocol.py:1-18
Timestamp: 2026-02-02T19:13:51.594Z
Learning: In Python files across the repo, do not suggest or add inline or file-level suppressions for Ruff F722 (jaxtyping dimension strings). Instead rely on a global ignore configured in pyproject.toml (as done by the sampleworks project). Verify that F722 is included in the global ignore, and avoid recommending per-file # ruff: noqa: F722 directives unless explicitly documented as an exception.
Learnt from: marcuscollins
Repo: diff-use/sampleworks PR: 132
File: src/sampleworks/utils/guidance_script_utils.py:585-586
Timestamp: 2026-03-05T16:30:40.514Z
Learning: In Python code, if enums subclassing StrEnum are used (e.g., GuidanceType, StructurePredictor), they serialize to plain strings with json.dump and pickle without special handling. Do not flag these as non-serializable in code reviews. Treat StrEnum values as strings for JSON serialization and ensure tests cover that behavior; no extra pickle handling needed.
Learnt from: DorisMai
Repo: diff-use/sampleworks PR: 272
File: tests/rewards/test_reward_function_contract.py:39-44
Timestamp: 2026-06-30T19:29:37.684Z
Learning: In this repository’s pytest test suite, do not automatically apply `pytest.mark.slow` just because a test requires a GPU. Use `pytest.mark.gpu` for GPU-dependent tests, and reserve `pytest.mark.slow` for tests that are genuinely slow based on measured per-test runtime (e.g., tests like `tests/rewards/test_reward_function_contract.py` may keep only `pytest.mark.gpu` when they run quickly). If a GPU test is fast, it should not be marked `slow`; if it is slow, add `slow` in addition to `gpu`.
| assert argv[:5] == ["pixi", "run", "-e", "rf3", "python"] | ||
| assert "run_grid_search.py" in argv[5] # the exact path varies by invocation |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win
Make this argv assertion hermetic against baked env detection.
runner._build_argv() switches to a direct Python executable whenever a baked pixi env exists, so this test can still fail on machines that already have .pixi/envs/rf3/bin/python even though the script-path check is now looser. Force the pixi path in the test setup before asserting the ["pixi", "run", ...] prefix.
Suggested hardening
monkeypatch.setenv("SAMPLEWORKS_FORCE_PIXI", "1")🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/runs/test_runner.py` around lines 29 - 30, The argv assertion in the
test is not hermetic because runner._build_argv() can switch to a direct Python
executable when a baked pixi env is detected. Update the test setup for the
affected assertion in test_runner.py to force the pixi code path before checking
the ["pixi", "run", "-e", "rf3", "python"] prefix, using the existing test
fixture/monkeypatch around runner._build_argv() so the environment detection
cannot change the expected argv.
There was a problem hiding this comment.
@coderabbitai Please make an issue for this but do not implement it. Please mention @manzuoni-astera and @xraymemory and tag the issue "P1" and "engineering"
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
repo="diff-use/sampleworks"
gh issue create \
--repo "$repo" \
--title "Harden rf3_partial argv test against baked pixi env detection" \
--body "## Summary
The argv assertion in \`tests/runs/test_runner.py\` (\`test_argv_for_rf3_partial_matches_bash\`, around Line 29) was relaxed to check the \`run_grid_search.py\` script path by substring instead of an exact match. However, \`runner._build_argv()\` switches to a direct Python executable whenever a baked pixi env (e.g. \`.pixi/envs/rf3/bin/python\`) is detected on the machine running the tests. This means the test can still fail non-deterministically on machines/CI runners that already have a baked pixi env, even with the looser substring check, because the \`[\"pixi\", \"run\", -e\", \"rf3\", \"python\"]\` prefix assertion assumes the pixi code path is always taken.
## Required change
- In \`tests/runs/test_runner.py\`, force the pixi code path before asserting the \`[\"pixi\", \"run\", \"-e\", \"rf3\", \"python\"]\` prefix, e.g. via \`monkeypatch.setenv(\"SAMPLEWORKS_FORCE_PIXI\", \"1\")\` in \`test_argv_for_rf3_partial_matches_bash\`.
- Verify no other tests in the same file rely on the default (non-forced) baked-env detection behavior in a way that would be affected.
## Affected files
- \`tests/runs/test_runner.py\`
## Acceptance criteria
- \`test_argv_for_rf3_partial_matches_bash\` passes deterministically regardless of whether a baked pixi env exists on the test-running machine.
- No regression to other runner tests that intentionally exercise baked-env detection (e.g. \`test_uses_baked_env_python_when_available\`).
## References
- PR: https://github.com/diff-use/sampleworks/pull/274
- Discussion: https://github.com/diff-use/sampleworks/pull/274#discussion_r3493412921
Requested by: `@marcuscollins`
cc: `@manzuoni-astera` `@xraymemory`" \
--label "P1" \
--label "engineering"Length of output: 2092
@marcuscollins Done — I've opened #287 to track this without implementing it.
- Title: "Harden rf3_partial argv test against baked pixi env detection"
- Labels:
P1,engineering - Mentions:
@manzuoni-asteraand@xraymemory - Links back to this PR and discussion comment for context
🐇📝
🧠 Learnings used
Learnt from: k-chrispens
Repo: diff-use/sampleworks PR: 73
File: src/sampleworks/core/samplers/protocol.py:1-18
Timestamp: 2026-02-02T19:13:51.594Z
Learning: In Python files across the repo, do not suggest or add inline or file-level suppressions for Ruff F722 (jaxtyping dimension strings). Instead rely on a global ignore configured in pyproject.toml (as done by the sampleworks project). Verify that F722 is included in the global ignore, and avoid recommending per-file # ruff: noqa: F722 directives unless explicitly documented as an exception.
Learnt from: marcuscollins
Repo: diff-use/sampleworks PR: 132
File: src/sampleworks/utils/guidance_script_utils.py:585-586
Timestamp: 2026-03-05T16:30:40.514Z
Learning: In Python code, if enums subclassing StrEnum are used (e.g., GuidanceType, StructurePredictor), they serialize to plain strings with json.dump and pickle without special handling. Do not flag these as non-serializable in code reviews. Treat StrEnum values as strings for JSON serialization and ensure tests cover that behavior; no extra pickle handling needed.
Learnt from: DorisMai
Repo: diff-use/sampleworks PR: 272
File: tests/rewards/test_reward_function_contract.py:39-44
Timestamp: 2026-06-30T19:29:37.684Z
Learning: In this repository’s pytest test suite, do not automatically apply `pytest.mark.slow` just because a test requires a GPU. Use `pytest.mark.gpu` for GPU-dependent tests, and reserve `pytest.mark.slow` for tests that are genuinely slow based on measured per-test runtime (e.g., tests like `tests/rewards/test_reward_function_contract.py` may keep only `pytest.mark.gpu` when they run quickly). If a GPU test is fast, it should not be marked `slow`; if it is slow, add `slow` in addition to `gpu`.
## Summary - add Protpardelle-specific CLI flags for config path and sampling controls - forward config path into ProtpardelleWrapper construction - forward sampling options into annotate_structure_for_protpardelle - keep guidance --help lightweight by lazy-loading runtime dependencies Fixes #258 ## Validation - uvx ty check src/sampleworks/utils/guidance_script_arguments.py src/sampleworks/utils/guidance_script_utils.py src/sampleworks/cli/guidance.py tests/cli/test_guidance_cli.py tests/models/protpardelle/test_protpardelle_wrapper.py - uvx ruff check src/sampleworks/utils/guidance_script_arguments.py src/sampleworks/utils/guidance_script_utils.py src/sampleworks/cli/guidance.py tests/cli/test_guidance_cli.py tests/models/protpardelle/test_protpardelle_wrapper.py - PYTHONPATH=src uvx --python 3.12 --with pytest pytest --noconftest tests/cli/test_guidance_cli.py -q - PYTHONPATH=src uvx --python 3.12 --with pytest pytest --noconftest tests/models/protpardelle/test_protpardelle_wrapper.py -q (skipped locally: Protpardelle not installed) --------- Co-authored-by: Marcus D. Collins <marcus.collins@astera.org> Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/sampleworks/core/samplers/edm.py (1)
416-428: 🗄️ Data Integrity & Integration | 🔴 Critical | 🏗️ Heavy lift
model_wrapper.step()now breaks non-Protpardelle wrappers
src/sampleworks/core/samplers/edm.py:425-428passesepsas a third positional argument, butBoltz,Protenix, andRF3still definestep(x_t, t, *, features=...). That will raiseTypeErroron those sampling paths. Update the protocol and wrapper signatures together, or gate this call to the Protpardelle-only implementation.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/sampleworks/core/samplers/edm.py` around lines 416 - 428, The `model_wrapper.step()` call in the EDM sampler is now passing `eps` positionally, which breaks wrappers like `Boltz`, `Protenix`, and `RF3` that still only accept `step(x_t, t, *, features=...)`. Fix this by updating the shared protocol and the wrapper implementations together so their `step` signatures match the new call, or by restricting the extra `eps` argument to the Protpardelle-specific path. Check the `model_wrapper.step()` invocation in the EDM sampler and the corresponding `step` methods on the wrapper classes to keep the interface consistent.
🧹 Nitpick comments (1)
src/sampleworks/cli/guidance.py (1)
13-16: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winRedundant local
loggerimport duplicates the new top-level import.A top-level
from loguru import loggeris already added at the module level (lines 6-7), making this local re-import insidemain()dead code. Lazy-loadingget_model_and_device/run_guidanceto keep--helplight makes sense, butloguruis not a heavy dependency, so duplicating the import adds noise without benefit.♻️ Proposed fix
def main(argv: list[str] | None = None) -> int: config = GuidanceConfig.from_cli(argv) - from loguru import logger - from sampleworks.utils.guidance_script_utils import get_model_and_device, run_guidance logger.info(f"Running guidance with config: {config}")As per coding guidelines, "No dead code and no compatibility shims for hypothetical users."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/sampleworks/cli/guidance.py` around lines 13 - 16, The local re-import of logger inside main() is redundant because logger is already imported at module scope. Remove the inner from loguru import logger from main() and keep the lazy import only for get_model_and_device and run_guidance so guidance.py uses the existing top-level logger symbol consistently.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@src/sampleworks/core/samplers/edm.py`:
- Around line 416-428: The `model_wrapper.step()` call in the EDM sampler is now
passing `eps` positionally, which breaks wrappers like `Boltz`, `Protenix`, and
`RF3` that still only accept `step(x_t, t, *, features=...)`. Fix this by
updating the shared protocol and the wrapper implementations together so their
`step` signatures match the new call, or by restricting the extra `eps` argument
to the Protpardelle-specific path. Check the `model_wrapper.step()` invocation
in the EDM sampler and the corresponding `step` methods on the wrapper classes
to keep the interface consistent.
---
Nitpick comments:
In `@src/sampleworks/cli/guidance.py`:
- Around line 13-16: The local re-import of logger inside main() is redundant
because logger is already imported at module scope. Remove the inner from loguru
import logger from main() and keep the lazy import only for get_model_and_device
and run_guidance so guidance.py uses the existing top-level logger symbol
consistently.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f581f2ae-9bd1-4cb2-9dbc-72bd9be16919
📒 Files selected for processing (7)
src/sampleworks/cli/guidance.pysrc/sampleworks/core/samplers/edm.pysrc/sampleworks/models/protpardelle/wrapper.pysrc/sampleworks/utils/guidance_script_arguments.pysrc/sampleworks/utils/guidance_script_utils.pytests/cli/test_guidance_cli.pytests/models/protpardelle/test_protpardelle_wrapper.py
🚧 Files skipped from review as they are similar to previous changes (4)
- src/sampleworks/utils/guidance_script_arguments.py
- src/sampleworks/utils/guidance_script_utils.py
- tests/models/protpardelle/test_protpardelle_wrapper.py
- src/sampleworks/models/protpardelle/wrapper.py
k-chrispens
left a comment
There was a problem hiding this comment.
I saw some of the recent issues as I left work today and got sucked into reading the rest of this PR. If this is ready for review, these are my thoughts. The most important thing is that it doesn't appear that the modelwrapper step function is aligned with how it is called in the sampler. It would also be good if the protpardelle wrapper got plugged into the other modelwrapper integration tests in tests/conftest.py - I think several of these tests would fail in the current implementation
| getattr(config, "model_checkpoint", None), | ||
| config.model, | ||
| method=getattr(config, "method", None), | ||
| protpardelle_config_path=getattr(config, "protpardelle_config_path", None), |
There was a problem hiding this comment.
Is there a way we can remove this here, or at least make it more general rather than including a model specific arg here? I know there are a lot of models that use hydra YAML configs, so I expect we will probably need to generalize this at some point
| # I need to modify the Protocol itself. @Michael Anzuoni | ||
| with torch.set_grad_enabled(allow_gradients): | ||
| x_hat_0 = model_wrapper.step(noisy_state, t_hat, features=features) | ||
| x_hat_0 = model_wrapper.step(noisy_state, t_hat, eps, features=features) |
There was a problem hiding this comment.
I'm confused here, because this no longer appears to be needed in the model step function that is in this PR. So I think #283 should be closed.
| @@ -0,0 +1,766 @@ | |||
| """Wrapper for Protpardelle-1c models. | |||
|
|
|||
| Follows the ``StructureModelWrapper`` protocol in | |||
There was a problem hiding this comment.
This isn't correct - this shouldn't be following the StructureModelWrapper. It should be (and I believe is, though that would require an isinstance check) a FlowModelWrapper
| models (``task: "ai-allatom"``, e.g. the model defined by ``cc89.yaml``). | ||
| These models take a protein sequence as their only conditioning input and | ||
| generate an all-atom structure by running the full reverse-diffusion | ||
| trajectory internally. Because a single ``step`` call encapsulates the entire |
There was a problem hiding this comment.
Also confused by this - the step call shouldn't be encapsulating the whole diffusion loop (and currently isn't)
|
|
||
| # To understand all these arguments better, you can study the (complicated) | ||
| # Protpardelle.sample method. Hints: partial diffusion is enabled and cc.enabled = False! | ||
| # https://github.com/ProteinDesignLab/protpardelle-1c/src/protpardelle/core/models.py |
There was a problem hiding this comment.
This link doesn't exist. I am assuming you mean this? https://github.com/ProteinDesignLab/protpardelle-1c/blob/ee378400f25b801fa481028000f9060183d7fb4c/src/protpardelle/core/models.py#L740
There was a problem hiding this comment.
| default=1.0, | ||
| help="Temperature applied to Protpardelle aatype logits", | ||
| ) | ||
| parser.add_argument( |
There was a problem hiding this comment.
We shouldn't need this since we will never be doing design, right?
| structure, | ||
| ensemble_size=args.ensemble_size, | ||
| num_steps=args.num_diffusion_steps, | ||
| s_churn=getattr(args, "protpardelle_s_churn", 40.0), |
There was a problem hiding this comment.
I imagine several of these are not going to be adjusted - not that we want to restrict the number of knobs, but especially the design arguments don't seem relevant to include here. Also, several of these are not consumed anywhere aside from in the annotation on the structure.
| str(template_job.device), | ||
| template_job.model_checkpoint, | ||
| template_job.model, | ||
| template_job.model, # this is not actually the model, it's the model name, e.g. boltz2 |
There was a problem hiding this comment.
should we just rename this to model_name?
| atoms implied by the input sequence (the ``atom_mask`` in the | ||
| conditioning). | ||
|
|
||
| Parameters |
There was a problem hiding this comment.
This docstring isn't correct
| self, | ||
| x_t: Float[Tensor, "batch atoms 3"], | ||
| t: Float[Tensor, "*batch"] | float, | ||
| sigma_float: float, |
There was a problem hiding this comment.
unused parameter that should be removed
## Summary Fixes #283 and unbreaks CI on #274 (`mdc/add-protpardelle`). The EDM sampler called `model_wrapper.step(noisy_state, t_hat, eps, features=features)`, passing a **third positional `eps`** that is not part of the `FlowModelWrapper.step(x_t, t, *, features=None)` protocol. Because boltz/protenix/rf3 (and the test mocks) implement the protocol exactly — no extra positional, no `**kwargs` — every sampler-path test failed with: ``` TypeError: <Wrapper>.step() takes 3 positional arguments but 4 positional arguments (and 1 keyword-only argument) were given src/sampleworks/core/samplers/edm.py:428 ``` That single call site accounts for **all 164 failing tests** across the `boltz-dev` / `protenix-dev` / `rf3-dev` jobs (110 `MismatchCaseWrapper` + 54 `MockFlowModelWrapper`). The extra argument was only ever absorbed by Protpardelle's `step()`, where it landed in a `sigma_float` parameter that is **never referenced** in the method body — so removing it changes no behavior. This matches @k-chrispens's note on #283 (already addressed by #267; `eps` is the noise tensor, not a noise level) and restores the resolution that was lost when the branch was force-updated. ## Changes **Protocol reconciliation (#283)** - `core/samplers/edm.py`: drop the extra `eps` positional from the `model_wrapper.step()` call. `eps` is still computed locally and used to build the noisy state and the working-frame guidance math — only the (dead) hand-off to the wrapper is removed. Deleted the temporary "I need to modify the Protocol itself" TODO. - `models/protpardelle/wrapper.py`: remove the unused `sigma_float` parameter so `step()` matches `FlowModelWrapper`. **Lint job (was failing, 8 errors)** - `cli/guidance.py`: remove unused top-level `from loguru import logger` (it's re-imported inside `main()`) — fixes F401 / F811 / I001. - `models/protpardelle/wrapper.py`: use the repo's existing leading-space single-axis jaxtyping convention (`Int[Tensor, " atoms"]`, as in `edm.py`/`step_scalers.py`) to avoid UP037 / F821. - `tests/runs/test_runner.py`: wrap an over-length inline comment (E501). ## Validation - `ruff 0.15.8 check` passes on all touched files locally (same version as CI). - Test suites can't run on this macOS checkout (pixi envs are linux-64); CI on this PR exercises them. Every failing test traced to the removed `edm.py:428` positional and the tests already call `step(x, t, features=...)`, so they should go green. Closes #283. Co-authored-by: xraymemory <me.anzuoni@gmail.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pyproject.toml`:
- Around line 87-98: The concern is not the gemmi override; instead, the package
metadata is missing a direct dependency declaration for gemmi even though
generate_synthetic_sf.py imports it. Update the [project.dependencies] list in
pyproject.toml to include gemmi so the import is represented in the declared
requirements, and leave the existing
[tool.pixi.pypi-options.dependency-overrides] entry unchanged since it already
applies to the resolved gemmi version.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 2bf5eeeb-b6dd-4fea-8d50-7c20601ab47f
⛔ Files ignored due to path filters (1)
pixi.lockis excluded by!**/*.lock
📒 Files selected for processing (5)
pyproject.tomlsrc/sampleworks/cli/guidance.pysrc/sampleworks/core/samplers/edm.pysrc/sampleworks/models/protpardelle/wrapper.pytests/runs/test_runner.py
💤 Files with no reviewable changes (1)
- src/sampleworks/cli/guidance.py
✅ Files skipped from review due to trivial changes (1)
- src/sampleworks/core/samplers/edm.py
🚧 Files skipped from review as they are similar to previous changes (2)
- tests/runs/test_runner.py
- src/sampleworks/models/protpardelle/wrapper.py
There was a problem hiding this comment.
Caution
Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pyproject.toml`:
- Around line 87-98: The concern is not the gemmi override; instead, the package
metadata is missing a direct dependency declaration for gemmi even though
generate_synthetic_sf.py imports it. Update the [project.dependencies] list in
pyproject.toml to include gemmi so the import is represented in the declared
requirements, and leave the existing
[tool.pixi.pypi-options.dependency-overrides] entry unchanged since it already
applies to the resolved gemmi version.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 2bf5eeeb-b6dd-4fea-8d50-7c20601ab47f
⛔ Files ignored due to path filters (1)
pixi.lockis excluded by!**/*.lock
📒 Files selected for processing (5)
pyproject.tomlsrc/sampleworks/cli/guidance.pysrc/sampleworks/core/samplers/edm.pysrc/sampleworks/models/protpardelle/wrapper.pytests/runs/test_runner.py
💤 Files with no reviewable changes (1)
- src/sampleworks/cli/guidance.py
✅ Files skipped from review due to trivial changes (1)
- src/sampleworks/core/samplers/edm.py
🚧 Files skipped from review as they are similar to previous changes (2)
- tests/runs/test_runner.py
- src/sampleworks/models/protpardelle/wrapper.py
🛑 Comments failed to post (1)
pyproject.toml (1)
87-98: 🗄️ Data Integrity & Integration | 🟡 Minor | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash set -euo pipefail python - <<'PY' import tomllib, pathlib data = tomllib.loads(pathlib.Path("pyproject.toml").read_text()) pixi = data["tool"]["pixi"] print("Top-level dependency-overrides:", pixi.get("pypi-options", {}).get("dependency-overrides", {})) proto = pixi.get("feature", {}).get("protpardelle", {}) print("protpardelle deps:", proto.get("dependencies", {})) print("protpardelle pypi-deps:", proto.get("pypi-dependencies", {})) PY echo "--- occurrences of dependency-overrides header ---" grep -n '\[tool\.pixi\.pypi-options\.dependency-overrides\]' pyproject.toml echo "--- gemmi references across repo ---" rg -n '\bgemmi\b' .Repository: diff-use/sampleworks
Length of output: 9158
Remove the concern about a no-op override;
gemmiis already part of the resolved environment.
pyproject.tomlpinsgemmiin[tool.pixi.pypi-options.dependency-overrides], andpixi.lockalready containsgemmi==0.6.7, so the override has something to apply to. The remaining issue is thatgenerate_synthetic_sf.pyimportsgemmidirectly even though it is not declared in[project.dependencies], which can make the package metadata incomplete.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pyproject.toml` around lines 87 - 98, The concern is not the gemmi override; instead, the package metadata is missing a direct dependency declaration for gemmi even though generate_synthetic_sf.py imports it. Update the [project.dependencies] list in pyproject.toml to include gemmi so the import is represented in the declared requirements, and leave the existing [tool.pixi.pypi-options.dependency-overrides] entry unchanged since it already applies to the resolved gemmi version.
This is a working version of Protpardelle-1c in SampleWorks. It may still require parameter tuning and other updates--in particular it isn't clear how to use self-conditioning. However we can generate structures with this version and so I'm making this PR as a milestone.
Only a couple significant changes have been made outside the Protpardelle wrapper code itself. One is to make a features class that is unfrozen, so that we can pass self-conditioning input forward to the next Euler step during sampling. Another is that we define different sampling parameters for Protpardelle when instantiating the sampler. There is no CLI control for either change in this PR.
Summary by CodeRabbit
cc89_epoch415configuration and Protpardelle wrapper functionality for sampling/initialization.